# Efficient deployment

Gemma 3 4b It Quantized.w4a16
A quantized version of google/gemma-3-4b-it that uses INT4 weights with 16-bit (FP16) activations to improve inference efficiency.
Image-to-Text Transformers
RedHatAI
195
1
GLM 4 32B 0414 4bit DWQ
MIT
This is the MLX format version of the THUDM/GLM-4-32B-0414 model, processed with 4-bit DWQ quantization, suitable for efficient inference on Apple silicon devices.
Large Language Model Supports Multiple Languages
mlx-community
156
4
Spec T1 RL 7B
MIT
Spec-T1-RL-7B is a large language model focused on mathematical reasoning, algorithmic problem-solving, and code generation, and is reported to perform well on technical benchmarks.
Large Language Model Safetensors English
SVECTOR-CORPORATION
4,626
6
Qwen3 30B A3B Gptq 8bit
Apache-2.0
An 8-bit GPTQ-quantized version of the Qwen3 30B A3B large language model, suitable for efficient inference.
Large Language Model Transformers
btbtyler09
301
2
Whisper Large V3 Turbo Quantized.w4a16
Apache-2.0
An INT4 weight-quantized version of openai/whisper-large-v3-turbo for efficient speech-to-text transcription.
Speech Recognition Transformers English
RedHatAI
1,851
2
Llama 2 7b Chat Hf GGUF
GGUF builds of Llama 2 7B Chat, a 7B-parameter large language model developed by Meta, offered at multiple quantization levels to suit different hardware.
Large Language Model English
Mungert
1,348
3
Qwq 32B Bnb 4bit
Apache-2.0
A 4-bit quantized version of QwQ-32B, quantized with the bitsandbytes library, suitable for efficient inference in resource-constrained environments.
Large Language Model Transformers
onekq-ai
167
2
Llama 3 8B Instruct GPTQ 4 Bit
Other
This is a 4-bit GPTQ-quantized version of Meta Llama 3 8B Instruct, quantized by Astronomer, that can run efficiently on low-VRAM devices.
Large Language Model Transformers
astronomer
2,059
25
Moritzlaurer Roberta Base Zeroshot V2.0 C Onnx
Apache-2.0
This is the ONNX format conversion of the MoritzLaurer/roberta-base-zeroshot-v2.0-c model, suitable for zero-shot classification tasks.
Text Classification Transformers
protectai
14.94k
0